Team #256 - CellFinder Text Mining Pipeline
Not registered
Abstract
This document describes the CellFinder text mining pipeline as applied to the curation of gene expression data in cells and anatomical parts. The CellFinder database is a repository of cell research that aims to integrate data derived from many sources, such as literature curation and microarray experiments. In scientific publications, curatable gene expression events correspond to text passages that associate a gene/protein with a certain cell/anatomical part and an expression trigger, i.e., a word which indicates that the event is taking place. The sentence below illustrates one such example (PMID 18989465):
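The event definition in this abstract (a gene/protein, a cell or anatomical part, and an expression trigger occurring in the same passage) can be sketched as a small data structure with a naive co-occurrence check. This is only a minimal illustration under stated assumptions: the class name `ExpressionEvent`, the function `find_events`, the tiny lexicons and the sample sentence are all hypothetical, invented here for demonstration. The sample sentence is not the example from PMID 18989465, and the matching shown is not the CellFinder pipeline's actual method.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExpressionEvent:
    """One curatable event: a gene/protein, a cell or anatomical part, and a trigger."""
    gene: str
    location: str
    trigger: str


# Hypothetical mini-lexicons, for illustration only. A real system would rely on
# named-entity recognition rather than fixed word lists.
GENES = ("NANOG",)
LOCATIONS = ("embryonic stem cells",)
TRIGGERS = ("expressed", "expression", "expresses")


def find_events(sentence: str) -> List[ExpressionEvent]:
    """Return every (gene, location, trigger) triple whose terms all occur in the sentence."""
    lowered = sentence.lower()
    genes = [g for g in GENES if g.lower() in lowered]
    locations = [loc for loc in LOCATIONS if loc.lower() in lowered]
    triggers = [t for t in TRIGGERS if t in lowered]
    return [
        ExpressionEvent(gene, location, trigger)
        for gene in genes
        for location in locations
        for trigger in triggers
    ]


# Invented sentence, used only to exercise the sketch.
sample = "NANOG is highly expressed in embryonic stem cells."
events = find_events(sample)
print(events)
```

Sentence-level co-occurrence of this kind over-generates in practice, which is one reason a pipeline would normally add dedicated entity recognition and event extraction steps on top of it.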
Similar resources
Preliminary evaluation of the CellFinder literature curation pipeline for gene expression in kidney cells and anatomical parts
Biomedical literature curation is the process of automatically and/or manually deriving knowledge from scientific publications and recording it into specialized databases for structured delivery to users. It is a slow, error-prone, complex, costly and, yet, highly important task. Previous experiences have proven that text mining can assist in its many phases, especially, in triage of relevant d...
Evaluation of the CellFinder pipeline in the BioCreative IV User Interactive task
We present results on the participation of the CellFinder text mining pipeline for curation of gene/protein expression in anatomical parts in the BioCreative IV User Interactive task. The pipeline integrates state-of-the-art and freely available tools for the following steps: triage of potentially relevant documents, retrieval of documents, preprocessing, named-entity recognition, event extract...
CellFinder: a cell data repository
CellFinder (http://www.cellfinder.org) is a comprehensive one-stop resource for molecular data characterizing mammalian cells in different tissues and in different development stages. It is built from carefully selected data sets stemming from other curated databases and the biomedical literature. To date, CellFinder describes 3394 cell types and 50 951 cell lines. The database currently contai...
OntoPDF: using a text mining pipeline to generate enriched pdf versions of scientific papers
In this poster we present a recent extension of the OntoGene text mining utilities, which enables the generation of annotated pdf versions of the original articles. While a text-based view (in XML or HTML) can allow a more flexible presentation of the results of a text mining pipeline, for some applications, notably in assisted curation, it might be desirable to present the annotations in the c...